Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Li, Siyuan, Wang, Rui, Tang, Minxue, Zhang, Chongjie

Neural Information Processing Systems

Hierarchical Reinforcement Learning (HRL) is a promising approach to solving long-horizon problems with sparse and delayed rewards. Many existing HRL algorithms either use pre-trained low-level skills that are unadaptable, or require domain-specific information to define low-level rewards. In this paper, we aim to adapt low-level skills to downstream tasks while maintaining the generality of reward design. We propose an HRL framework that sets auxiliary rewards for low-level skill training based on the advantage function of the high-level policy. This auxiliary reward enables efficient, simultaneous learning of the high-level policy and low-level skills without using task-specific knowledge. In addition, we theoretically prove that optimizing low-level skills with this auxiliary reward increases the task return of the joint policy. Experimental results show that our algorithm dramatically outperforms other state-of-the-art HRL methods in MuJoCo domains. We also find that both the low-level and high-level policies trained by our algorithm are transferable.
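To make the mechanism concrete, here is a minimal sketch of how such an advantage-based auxiliary reward could be computed, assuming a one-step estimate of the high-level advantage is split evenly across the k low-level steps a high-level action spans. The function and variable names are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def auxiliary_low_level_rewards(r_h, v_now, v_next, gamma_h, k):
        # One-step estimate of the high-level advantage:
        #   A_h = r_h + gamma_h * V_h(s_{t+k}) - V_h(s_t)
        advantage_h = r_h + gamma_h * v_next - v_now
        # Assumption for this sketch: the advantage is distributed
        # evenly over the k low-level steps the high-level action spans.
        return np.full(k, advantage_h / k)

    # Example: a high-level action that earned reward 1.0 and lasted k=10 low-level steps.
    print(auxiliary_low_level_rewards(r_h=1.0, v_now=0.2, v_next=0.5, gamma_h=0.99, k=10))

Under this reading, the resulting per-step values would serve as (or be added to) the low-level reward during skill training, so that skills are reinforced exactly when they make the high-level action better than the high-level value baseline predicted.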


Reviews: Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Neural Information Processing Systems

This is an interesting approach and seems novel in the context of options, although it appears to have some similarities to potential-based reward shaping, e.g., (Devlin and Kudenko, 2012). The main advantages claimed for HAAR are (loosely) improved performance under sparse rewards and the learning of skills suitable for transfer. These claims could be made more explicit, which would help to justify the experimental section. The authors define the advantage as $A_h(s_t^h, a_t^h) = \mathbb{E}\big[r_t^h + \gamma_h V_h(s_{t+k}^h) - V_h(s_t^h)\big]$. The meaning of this is a little ambiguous, and I would prefer it to be clarified.
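For context on the comparison the review draws, potential-based reward shaping augments the environment reward with a term of the form

    F(s, s') = \gamma\, \Phi(s') - \Phi(s), \qquad
    \tilde{r}(s, a, s') = r(s, a, s') + F(s, s')

Choosing the potential $\Phi = V_h$ gives $\tilde{r} = r + \gamma V_h(s') - V_h(s)$, which matches the TD-error / advantage form of HAAR's low-level reward up to the expectation and the k-step horizon; this is presumably the similarity the reviewer has in mind.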


Reviews: Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Neural Information Processing Systems

The paper presents HAAR, a hierarchical reinforcement learning approach based on the idea of using the advantage / temporal-difference error of the high-level controller to provide the reward signal for the lower layer. The reviewers judged this approach to be novel, and the empirical results are promising. Analytical results provide improvement guarantees similar to those of a base algorithm like TRPO. Several areas for improvement were mentioned, and many of these were addressed in the rebuttal. For example, the reviewers were pleased to see the additional experiment showing performance from random skill initialization.

